6 research outputs found

    Modelo acústico de língua inglesa falada por portugueses (Acoustic model of the English language as spoken by Portuguese speakers)

    Master's project work in Informatics Engineering, presented to the Universidade de Lisboa, through the Faculdade de Ciências, 2007. In the context of robust speech recognition based on hidden Markov models (HMMs), this work describes methodologies and experiments aimed at recognizing non-native speakers. Speech recognition necessarily involves acoustic models. Acoustic models reflect how we pronounce and articulate a language, modelling the sequence of sounds emitted during speech. This modelling rests on minimal speech segments, the phones, for which there are sets of symbols (alphabets) representing their pronunciation. The representation of these symbols, and their articulation and pronunciation, are studied in articulatory and acoustic phonetics. Words can be described by analysing their constituent units, the phones. A speech recognizer interprets the input signal, speech, as a sequence of coded symbols. To do so, the signal is split into observations of roughly 10 milliseconds each, reducing the unit of analysis to a time interval over which the characteristics of a sound segment do not vary. Acoustic models give the probability that a given observation corresponds to a given entity. It is therefore through models of the entities in the vocabulary to be recognized that these sound fragments can be put back together. The models developed in this work are based on HMMs, so called because they build on the chains of Markov (1856-1922): sequences of states in which each state is conditioned on its predecessor. Applied to our domain, this means building a set of models, one for each class of sounds to be recognized, which are then trained on training data.
The data are audio files and their word-level transcriptions, so that each transcription can be decomposed into phones and aligned with each sound of the corresponding audio file. Using a state model in which each state represents an observation or described speech segment, the data are progressively regrouped to create increasingly reliable statistical models that represent the speech entities of a given language. Recognizing foreign speakers whose pronunciation differs from that of the language the recognizer was designed for can seriously degrade recognizer accuracy. This variation can be even more problematic than dialectal variation within a language, because it depends on each speaker's knowledge of the foreign language. Using a small amount of audio from foreign speakers to train new acoustic models, several experiments were carried out with corpora of Portuguese speakers speaking English, of European Portuguese, and of English. Initially, the behaviour of the native English and native Portuguese models was explored separately, testing each on the test corpora (a native test set and a non-native test set). Next, another model was trained using both the audio of Portuguese speakers speaking English and that of native English speakers as the training corpus. A further experiment used adaptation techniques such as Maximum Likelihood Linear Regression (MLLR). MLLR adapts a given initial model to a particular speaker characteristic, here the foreign accent. From a small amount of data representing the characteristic to be modelled, it computes a set of transformations that are applied to the model being adapted.
Phonetic modelling was also explored, studying how the foreign speaker pronounces the foreign language, in this case a Portuguese speaker speaking English. This study was carried out with the help of a linguist, who defined a set of phones, obtained by mapping the English phone inventory onto the Portuguese one, that represents English as spoken by Portuguese speakers of a particular prestige group. Given the wide variability of pronunciations, this group had to be defined according to the speakers' literacy level. The study was then used to create a new model trained on the corpora of Portuguese speakers speaking English and of native Portuguese speakers. The result is a native Portuguese recognizer that can also recognize English terms. Within the same speech recognition theme, this project also addressed the collection of corpora for European Portuguese and the compilation of a European Portuguese lexicon. For corpus acquisition, the author was involved in extracting and preparing telephone speech data for later training of new European Portuguese acoustic models. The lexicon was compiled with an incremental, semi-automatic method: pronunciations were generated automatically for groups of 10,000 words, each group was reviewed and corrected by a linguist, and each reviewed group was then used to improve the automatic pronunciation-generation rules.
    The tremendous growth of technology has increased the need to integrate spoken language technologies into everyday applications, providing easy and natural access to information. These applications differ in nature and in their user interfaces.
Besides voice-enabled Internet portals or tourist information systems, automatic speech recognition systems can be used in the home, where TVs and other appliances could be voice controlled, replacing keyboard or mouse interfaces, or in mobile phones and palm-sized computers for hands-free and eyes-free operation. Developing these systems raises several known difficulties. One of them concerns recognizer accuracy when dealing with non-native speakers whose pronunciation of a given language differs phonetically. A non-native accent can be more problematic than dialect variation within the language. This mismatch depends on the individual's speaking proficiency and mother tongue. Consequently, when the speaker's native language is not the one used to train the recognizer, there is a considerable loss in recognition performance. In this thesis, we examine the problem of non-native speech in a speaker-independent, large-vocabulary recognizer in which a small amount of non-native data was used for training. Several experiments were performed using hidden Markov models trained with speech corpora containing European Portuguese native speakers, English native speakers, and English spoken by European Portuguese native speakers. Initially, the behaviour of a native English model and a non-native English speakers' model was explored. Then, using different corpus weights for the native English speakers and the English spoken by Portuguese speakers, a model was trained as a pool of accents. Among adaptation techniques, the Maximum Likelihood Linear Regression method was used. How European Portuguese speakers pronounce English was also explored by studying the correspondences between the phone sets of the foreign and target languages. The result was a new phone set, obtained by mapping between the English and Portuguese phone sets.
Then a new model was trained with data of English spoken by Portuguese speakers and with native Portuguese data. Within the speech recognition theme, this work has two further purposes: collecting Portuguese corpora and supporting the compilation of a Portuguese lexicon, adopting methods and algorithms to generate phonetic pronunciations automatically. The collected corpora were processed in order to train acoustic models to be used in the Exchange 2007 domain, namely in Outlook Voice Access.
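
The abstract names Maximum Likelihood Linear Regression but does not give its form. As a rough illustration only: in the usual formulation of MLLR mean adaptation, every Gaussian mean in a regression class is moved by one shared affine transform, mean' = A * mean + b. The sketch below shows just the application of such a transform. Estimating A and b from adaptation data is omitted, and the function names and the 2-dimensional numbers are hypothetical, not taken from the thesis.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mllr_adapt_mean(mean: Vector, A: Matrix, b: Vector) -> Vector:
    """Return the adapted mean A * mean + b (one Gaussian)."""
    return [
        sum(a_ij * m_j for a_ij, m_j in zip(row, mean)) + b_i
        for row, b_i in zip(A, b)
    ]


def adapt_all(means: List[Vector], A: Matrix, b: Vector) -> List[Vector]:
    """Apply the same transform to every mean in a regression class."""
    return [mllr_adapt_mean(m, A, b) for m in means]


# Hypothetical toy values: two 2-dimensional Gaussian means and one transform.
A = [[1.0, 0.0],
     [0.0, 2.0]]
b = [0.5, -1.0]
means = [[1.0, 2.0], [3.0, 4.0]]

adapted = adapt_all(means, A, b)
print(adapted)
```

Because the transform is shared, a small amount of accented speech can shift many Gaussians at once, which matches the abstract's point that only a small amount of non-native data is needed.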

    Familial hypercholesterolaemia in children and adolescents from 48 countries: a cross-sectional study

    Background: Approximately 450 000 children are born with familial hypercholesterolaemia worldwide every year, yet only 2·1% of adults with familial hypercholesterolaemia were diagnosed before age 18 years via current diagnostic approaches, which are derived from observations in adults. We aimed to characterise children and adolescents with heterozygous familial hypercholesterolaemia (HeFH) and understand current approaches to the identification and management of familial hypercholesterolaemia to inform future public health strategies. Methods: For this cross-sectional study, we assessed children and adolescents younger than 18 years with a clinical or genetic diagnosis of HeFH at the time of entry into the Familial Hypercholesterolaemia Studies Collaboration (FHSC) registry between Oct 1, 2015, and Jan 31, 2021. Data in the registry were collected from 55 regional or national registries in 48 countries. Diagnoses relying on self-reported history of familial hypercholesterolaemia and suspected secondary hypercholesterolaemia were excluded from the registry; people with untreated LDL cholesterol (LDL-C) of at least 13·0 mmol/L were excluded from this study. Data were assessed overall and by WHO region, World Bank country income status, age, diagnostic criteria, and index-case status. The main outcome of this study was to assess current identification and management of children and adolescents with familial hypercholesterolaemia. Findings: Of 63 093 individuals in the FHSC registry, 11 848 (18·8%) were children or adolescents younger than 18 years with HeFH and were included in this study; 5756 (50·2%) of 11 476 included individuals were female and 5720 (49·8%) were male. Sex data were missing for 372 (3·1%) of 11 848 individuals. Median age at registry entry was 9·6 years (IQR 5·8-13·2). 10 099 (89·9%) of 11 235 included individuals had a final genetically confirmed diagnosis of familial hypercholesterolaemia and 1136 (10·1%) had a clinical diagnosis.
Genetically confirmed diagnosis data or clinical diagnosis data were missing for 613 (5·2%) of 11 848 individuals. Genetic diagnosis was more common in children and adolescents from high-income countries (9427 [92·4%] of 10 202) than in children and adolescents from non-high-income countries (199 [48·0%] of 415). 3414 (31·6%) of 10 804 children or adolescents were index cases. Familial-hypercholesterolaemia-related physical signs, cardiovascular risk factors, and cardiovascular disease were uncommon, but were more common in non-high-income countries. 7557 (72·4%) of 10 428 included children or adolescents were not taking lipid-lowering medication (LLM) and had a median LDL-C of 5·00 mmol/L (IQR 4·05-6·08). Compared with genetic diagnosis, the use of unadapted clinical criteria intended for use in adults and reliant on more extreme phenotypes could result in 50-75% of children and adolescents with familial hypercholesterolaemia not being identified. Interpretation: Clinical characteristics observed in adults with familial hypercholesterolaemia are uncommon in children and adolescents with familial hypercholesterolaemia, hence detection in this age group relies on measurement of LDL-C and genetic confirmation. Where genetic testing is unavailable, increased availability and use of LDL-C measurements in the first few years of life could help reduce the current gap between prevalence and detection, enabling increased use of combination LLM to reach recommended LDL-C targets early in life.

    Characterisation of microbial attack on archaeological bone

    As part of an EU-funded project to investigate the factors influencing bone preservation in the archaeological record, more than 250 bones from 41 archaeological sites in five countries spanning four climatic regions were studied for diagenetic alteration. Sites were selected to cover a range of environmental conditions and archaeological contexts. Microscopic and physical (mercury intrusion porosimetry) analyses of these bones revealed that the majority (68%) had suffered microbial attack. Furthermore, significant differences were found between animal and human bone in both the state of preservation and the type of microbial attack present. These differences in preservation might result from differences in early taphonomy of the bones. © 2003 Elsevier Science Ltd. All rights reserved.

    Evaluation of a quality improvement intervention to reduce anastomotic leak following right colectomy (EAGLE): pragmatic, batched stepped-wedge, cluster-randomized trial in 64 countries

    Background Anastomotic leak affects 8 per cent of patients after right colectomy with a 10-fold increased risk of postoperative death. The EAGLE study aimed to develop and test whether an international, standardized quality improvement intervention could reduce anastomotic leaks. Methods The internationally intended protocol, iteratively co-developed by a multistage Delphi process, comprised an online educational module introducing risk stratification, an intraoperative checklist, and harmonized surgical techniques. Clusters (hospital teams) were randomized to one of three arms with varied sequences of intervention/data collection by a derived stepped-wedge batch design (at least 18 hospital teams per batch). Patients were blinded to the study allocation. Low- and middle-income country enrolment was encouraged. The primary outcome (assessed by intention to treat) was anastomotic leak rate, and subgroup analyses by module completion (at least 80 per cent of surgeons, high engagement; less than 50 per cent, low engagement) were preplanned. Results A total of 355 hospital teams registered, with 332 from 64 countries (39.2 per cent low and middle income) included in the final analysis. The online modules were completed by half of the surgeons (2143 of 4411). The primary analysis included 3039 of the 3268 patients recruited (206 patients had no anastomosis and 23 were lost to follow-up), with anastomotic leaks arising before and after the intervention in 10.1 and 9.6 per cent respectively (adjusted OR 0.87, 95 per cent c.i. 0.59 to 1.30; P = 0.498).
The proportion of surgeons completing the educational modules was an influence: the leak rate decreased from 12.2 per cent (61 of 500) before intervention to 5.1 per cent (24 of 473) after intervention in high-engagement centres (adjusted OR 0.36, 0.20 to 0.64; P < 0.001), but this was not observed in low-engagement hospitals (8.3 per cent (59 of 714) and 13.8 per cent (61 of 443) respectively; adjusted OR 2.09, 1.31 to 3.31). Conclusion Completion of globally available digital training by engaged teams can alter anastomotic leak rates. Registration number: NCT04270721 (http://www.clinicaltrials.gov)
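
For readers who want to relate the subgroup counts above to the reported odds ratios, the following minimal sketch computes crude (unadjusted) odds ratios from the leak counts quoted in the abstract. It is only an arithmetic illustration: the trial reports adjusted ORs from its own model, so the crude values are expected to differ from the published 0.36 and 2.09. The function name is a hypothetical helper.

```python
def crude_odds_ratio(events_after: int, n_after: int,
                     events_before: int, n_before: int) -> float:
    """Unadjusted odds ratio of an event after versus before an intervention."""
    odds_after = events_after / (n_after - events_after)
    odds_before = events_before / (n_before - events_before)
    return odds_after / odds_before


# Counts quoted in the abstract (leaks / patients).
# High-engagement centres: 61 of 500 before, 24 of 473 after.
or_high = crude_odds_ratio(24, 473, 61, 500)
# Low-engagement hospitals: 59 of 714 before, 61 of 443 after.
or_low = crude_odds_ratio(61, 443, 59, 714)

print(f"high engagement, crude OR: {or_high:.2f}")
print(f"low engagement, crude OR:  {or_low:.2f}")
```

The crude values point in the same directions as the adjusted estimates (below 1 in high-engagement centres, above 1 in low-engagement hospitals), without accounting for the clustering and covariates handled in the trial's analysis.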